Antunes et al. BMC Biotechnology 2012, 12:86
http://www.biomedcentral.com/1472-6750/12/86






RESEARCH ARTICLE Open Access 



Targeted DNA excision in Arabidopsis by a 
re-engineered homing endonuclease 

Mauricio S Antunes, J Jeff Smith, Derek Jantz and June I Medford



Abstract 

Background: A systematic method for plant genome manipulation is a major aim of plant biotechnology. One 
approach to achieving this involves producing a double-strand DNA break at a genomic target site followed by the 
introduction or removal of DNA sequences by cellular DNA repair. Hence, a site-specific endonuclease capable of 
targeting double-strand breaks to unique locations in the plant genome is needed. 

Results: We engineered and tested a synthetic homing endonuclease, PB1, derived from the I-CreI endonuclease of
Chlamydomonas reinhardtii, which was re-designed to recognize and cleave a newly specified DNA sequence. We
demonstrate that an activity-optimized version of the PB1 endonuclease, under the control of a heat-inducible
promoter, is capable of targeting DNA breaks to an introduced PB1 recognition site in the genome of Arabidopsis
thaliana. We further demonstrate that this engineered endonuclease can very efficiently excise unwanted
transgenic DNA, such as an herbicide resistance marker, from the genome when the marker gene is flanked by PB1
recognition sites. Interestingly, under certain conditions the repair of the DNA junctions resulted in a conservative 
pairing of recognition half sites to remove the intervening DNA and reconstitute a single functional recognition 
site. 

Conclusion: These results establish parameters needed to use engineered homing endonucleases for the 
modification of endogenous loci in plant genomes. 

Background

The ability to genetically engineer plants has matured 
over the past 25 years, producing agronomic products 
with superior traits, and also, controversy. One source of 
significant objection to genetically engineered plants is 
the presence of antibiotic or herbicide resistance genes, 
frequently called 'selectable markers', in crops and foods
[1]. The recent approval, by the Chinese Ministry of 
Agriculture, of field tests for transgenic rice and maize 
expressing the Bacillus thuringiensis toxin, together with
the development of herbicide resistance traits in crops, has
heightened concerns (Nature Biotechnology 28, 390-391, May
2010). Many of these apprehensions could be alleviated
if genetically engineered plants could be produced with- 
out selectable markers. Methods to do so are largely im- 
practical because the frequency of stably introducing 
genes in plants is low. An alternative approach is to use
a selectable marker for the transformation process, fol- 
lowed by the removal of the marker gene after the trans- 
genic plant is obtained. Previous studies have shown this 
to be possible [2-7]. 

Because the ability to target genetic modifications to 
specific sites in a plant genome would facilitate both plant 
research and the ability to better modify commercially im- 
portant crop plants, many approaches have been previ- 
ously tried. One approach, based on homologous 
recombination (HR), typically used in yeast and mamma- 
lian cells, is largely ineffective in plants. This inefficiency is 
widely thought to be a result of a low rate of somatic re- 
combination in plants and the preferential repair of DNA 
breaks by non-homologous end-joining (NHEJ). Conse- 
quently, successful un-stimulated homologous gene inte- 
gration in plants requires large-scale screening procedures 
and strong positive/negative selection to identify a small 
number of events [8,9]. Another strategy is to improve 
homologous gene integration in plants by over-expressing 
genes involved in homologous recombination. For example,
Arabidopsis plants expressing the yeast RAD54 gene, which
encodes a chromatin remodeling protein, showed an increase
in homologous recombination frequency of one to two
orders of magnitude [10]. However, the frequency of targeted
transgene integration to an endogenous site is
approximately 0.01% to 0.1% in plants [11].

An alternative, and more widely investigated, strategy 
for the targeted modification of plant genomes is the 
production of a DNA break at a unique chromosomal 
location using a site-specific endonuclease that recog- 
nizes a relatively long, and therefore unique, DNA se- 
quence. Targeted chromosomal DNA breaks can be 
exploited to produce a wide range of genome modifica- 
tions including targeted gene insertion [12-15], gene ex- 
cision [16], and gene knock-out [17]. The effectiveness 
of this strategy has been demonstrated in Arabidopsis, 
tobacco, and maize. In these experiments, a DNA break 
was produced in the plant genome using a rare-cutting 
LAGLIDADG homing endonuclease, either the I-SceI
enzyme from S. cerevisiae, or I-CeuI from C. eugametos
[12,13]. Because recognition sites for these enzymes do 
not occur naturally in the plant genome, it was neces- 
sary, in each case, to insert an endonuclease recognition 
site into the genome prior to targeting it with the corre- 
sponding endonuclease. This need to "pre-engineer" 
plants to incorporate an endonuclease site limits the 
utility of natural (unmodified) homing endonucleases as 
genome engineering tools. 

A promising alternative to natural rare-cutting endo- 
nucleases is the production of engineered DNA-cleaving 
enzymes that can be directed to existing, user-specified 
locations in a plant genome. One such approach that 
has garnered attention is utilization of zinc-finger 
nucleases (ZFNs) [18,19]. ZFNs, chimeric fusions between
a zinc-finger DNA binding domain and the FokI
nuclease domain, have the ability to recognize and cut 
existing sites in a genome because the zinc-finger do- 
main can be engineered to recognize a variety of differ- 
ent DNA sequences. The power of ZFNs as genome 
modification reagents is highlighted by several publica- 
tions in which engineered ZFNs were used to target 
homologous integration at native sites in the human 
genome [20-24]. ZFNs have also been tested in Arabi- 
dopsis, tobacco, and maize and shown to be capable of 
targeting mutations to introduced sites by NHEJ and HR 
with frequencies as high as 16% and 2%, respectively 
[25,26]. However, two significant limitations of ZFNs have
been reported: (1) toxicity in plants and mammalian cells,
presumed to be caused by "off-site" cleavage [27,28], and
(2) imprecise events associated with their cleavage (e.g.,
deletions, small insertions) [29]. In addition, an approach
similar to ZFNs has been developed by fusing the FokI
domain to transcription activator-like (TAL) effector
proteins identified in plant pathogenic bacteria from the
genus Xanthomonas. These TAL effector nucleases
(TALEN) have been shown to successfully create targeted
double-strand breaks in mammalian cells and
plant protoplasts [30-32]. While the versatility of ZFNs
and TALEN lies in their ability to be engineered to 
recognize widely divergent DNA sequences, recent pub- 
lications show that this versatility can be introduced into 
other endonucleases. For example, protein engineering 
has also been applied to LAGLIDADG homing endonu- 
cleases [33-35]. These "custom" endonucleases derived 
from I-SceI and its homologs, I-MsoI and I-CreI, have
also been shown to target DNA breaks in bacteria, yeast,
and mammalian cell lines. More recently, Fauser et al.
(2012) reported a highly efficient gene targeting system 
in Arabidopsis that also uses a site-specific endonucle- 
ase. The improvement relies on the fact that the enzyme 
cuts both within the target and the chromosomal trans- 
genic donor, leading to an excised targeting vector [36]. 

We report here that an engineered homing endonucle- 
ase can be used to target DNA breaks in a higher plant. 
To demonstrate the strength of using rationally designed 
homing endonucleases for plant genome engineering, we 
produced an endonuclease called "PB1", derived from
the natural I-CreI endonuclease, but which recognizes
and cuts a very different DNA sequence. We show that 
this enzyme can efficiently cleave its intended recogni- 
tion sequence present on a stably integrated transgene 
in the Arabidopsis genome. We report that optimal in 
planta cleavage requires the addition of an N-terminal 
nuclear localization signal and introduction of a point 
mutation to increase DNA cleavage activity. Lastly, we 
demonstrate that this optimized PB1 endonuclease can
be used to efficiently excise an herbicide resistance mar- 
ker from transgenic Arabidopsis plants when the marker 
is flanked by recognition sequences for the enzyme. 
These results show that rationally designed endonucleases
derived from I-CreI may prove to be highly
adaptable tools for plant genome engineering. 

Results 

Production and in vitro analysis of the PB1 endonucleases

The native enzyme, I-CreI, is a homodimer whose natural
function is recognition and cleavage of a 22 bp
DNA sequence in the Chlamydomonas reinhardtii
chloroplast genome [37]. Figure 1A diagrams how the
I-CreI protein contacts the 22 bp cleavage site. Each
monomer of the homodimer makes direct and water-mediated
contacts with a nine base-pair "half-site". The
two half-sites, inverted with respect to one another, are
separated by a four base-pair center sequence that the
endonuclease does not directly contact. The enzyme
cleaves the phosphodiester bonds on either side of this
center sequence, leaving two unpaired four-base
3' DNA overhangs. Structural analyses of I-CreI
in complex with a variety of DNA sites reveal that
the enzyme has a relatively simple DNA recognition
mechanism by which individual bases in the cleavage
sequence are specified through direct contacts with a
single amino acid side chain [38-40]. This mechanism lends
itself to the production of engineered endonucleases
with altered cleavage site preferences because, first,
individual base preferences can be changed by mutating a
small number (1-3) of amino acids in the enzyme, and
second, the mutations that affect individual base
preferences are largely independent of one another, allowing
"mixing and matching" to produce endonucleases with
comprehensively redesigned DNA recognition properties
[34,41].

Figure 1 DNA-protein interactions of endonucleases and their in vitro cleavage of distinct DNA substrates. (A) Diagram of wild-type
I-CreI homodimer in complex with its natural DNA recognition sequence. One I-CreI monomer is shown in white, the other (I-CreI') in grey. DNA
sequence is indicated, with four base-pair center sequence shown in bold. Direct hydrogen bonds between I-CreI and DNA are shown as black
lines. Sites of phosphodiester bond cleavage and the resulting 4 bp 3' overhangs are indicated by a line. A likely unfavorable electrostatic
interaction between E80 and a backbone phosphate is indicated by a small arrow. (B) Predicted interactions between rationally designed PB1
endonuclease and the RSgtac DNA site. The two monomers (PB1 and PB1') and DNA interactions are as indicated in (A), except amino acids that
deviate from I-CreI and I-CreI' hydrogen bonds (or a hydrophobic interaction, C33) and are predicted to contribute to altered DNA-cleavage
specificity are indicated with dashed lines. PB1+ endonuclease contains a mutation (E80 to Q80) predicted to eliminate the unfavorable
interaction mentioned in (A). (C) Cleavage of DNA by native and rationally designed endonucleases. I-CreI, PB1, and PB1+ endonucleases were
incubated with three distinct linearized DNA substrates (sequence indicated above its respective set of digests). Sequence differences between
I-CreI (wild-type) and the two PB1 recognition sites are highlighted in grey. DNA for PB1taga (center) and PB1gtac (right) differ from each other by the
4 bp center sequence (subscript). Digests were conducted with 0, 0.007, 0.015, 0.031, 0.062, 0.125, 0.25, 0.5, 1, 2 mM endonuclease.

To determine whether an engineered endonuclease can 
specifically direct DNA cleavage to an introduced site in 
a plant genome, a structure-based design strategy was 
employed. The PB1 endonucleases were designed to
recognize a nine base-pair half-site 5'-CTCCGGGTC-3'
that differs at five out of nine bases from the half-site
recognized by the native I-CreI enzyme,
5'-CAAA(A/C)(C/T)GTC-3' (the two differ at positions 2
through 6 of the half-site).
Because the enzyme is a homodimer, we predict that the
re-designed PB1 should recognize and cleave the 22
base-pair recognition sequence
5'-CTCCGGGTC-NNNN-GACCCGGAG-3', where NNNN is a highly
variable four base-pair center sequence. We introduced eight
amino acid changes into the endonuclease monomer in order to
alter the sequence recognition of the resulting PB1
endonuclease (Figure 1B). In addition, because we previously
observed that alteration of the glutamic acid residue at
position 80 to glutamine (E80Q) increases the overall
activity of the endonuclease without affecting its cleavage
site preference, we also incorporated this change in PB1
to produce a higher activity endonuclease, referred to
later in the text as PB1+.
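The half-site arithmetic above can be checked with a short sketch. The sequences are those quoted in the text; the function and variable names are hypothetical and chosen only for illustration, and the degenerate native half-site is encoded as a list of allowed bases per position.

```python
# Illustrative sketch only: helper names are hypothetical, sequences are from the text.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def recognition_site(half_site: str, center: str) -> str:
    """Build a 22 bp site: half-site, 4 bp center, inverted half-site."""
    if len(half_site) != 9 or len(center) != 4:
        raise ValueError("expected a 9 bp half-site and a 4 bp center")
    return half_site + center + reverse_complement(half_site)


def differing_positions(half_site: str, degenerate: list) -> list:
    """1-based positions where half_site is not among the allowed bases."""
    return [
        i + 1
        for i, (base, allowed) in enumerate(zip(half_site, degenerate))
        if base not in allowed
    ]


PB1_HALF_SITE = "CTCCGGGTC"
# Native I-CreI half-site 5'-CAAA(A/C)(C/T)GTC-3', one entry per position.
ICREI_HALF_SITE = ["C", "A", "A", "A", "AC", "CT", "G", "T", "C"]

rs_gtac = recognition_site(PB1_HALF_SITE, "GTAC")  # consensus center
rs_taga = recognition_site(PB1_HALF_SITE, "TAGA")  # alternative center
changed = differing_positions(PB1_HALF_SITE, ICREI_HALF_SITE)

print(rs_gtac)
print(rs_taga)
print(changed)
```

The two sites built here correspond to the RSgtac and RStaga substrates compared later in the text; they differ only in the four center bases.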

The PB1 endonuclease variants, as well as wild-type
I-CreI, were expressed in E. coli, purified, and evaluated
in vitro for the ability to cleave DNA substrates containing
the intended target recognition sites (RS). Figure 1C
shows that the PB1 and PB1+ endonucleases efficiently
cleave their intended recognition site but do not cleave
the wild-type recognition site. As predicted, the PB1+
endonuclease (bottom row) cleaves its intended site
more efficiently than PB1 (center row), which lacks the
E80Q mutation. The crystal structure of I-CreI in complex
with its preferred DNA site suggests that the center
sequence does not play a major role in I-CreI
recognition [38]; however, some cleavage studies have indicated
that certain central four base pair sequences are cut
more efficiently. To test the impact of the central four
base pair sequence, we compared cleavage of DNA
substrates that differ only at these center four base
pairs. Figure 1C shows a higher PB1 cleavage efficiency
using a DNA substrate with the I-CreI consensus center
sequence (5'-GTAC-3', denoted RSgtac) compared
to a DNA substrate with a differing center sequence
(5'-TAGA-3', denoted RStaga).



PB1 can cleave an introduced recognition site in planta

To determine the requirements for engineered endonuclease
function in plants, we conducted a series of
experiments using the PB1 and PB1+ endonucleases and
two introduced recognition sites flanking a PstI site
(Figure 2A). Arabidopsis plants were individually transformed
with seven different T-DNA constructs encoding
the PB1 (JJS22, JJS23, and JJS26) or PB1+ (JJS20, JJS21,
JJS24 and JJS25) endonucleases under the control of a
heat-shock inducible promoter (Figure 2B). Distinct
endonuclease and RS combinations allowed us to test various
aspects of the function of the synthetic endonucleases in
plants. First, we tested whether a nuclear localization
signal (NLS) is needed for endonuclease function by including
the SV40 NLS in four of these constructs (JJS20,
JJS22, JJS24, and JJS26). Second, we tested the ability of
the PB1 endonucleases to cleave recognition sites with
the I-CreI consensus center sequence, RSgtac (JJS24,
JJS25, and JJS26), or distinct from the consensus sequence,
RStaga (JJS20, JJS21, JJS22, and JJS23). Third,
we tested in planta function of the E80Q mutation (PB1
and PB1+), which is thought to provide a more favorable
interaction of the endonuclease and DNA backbone.

Figure 2 In planta cleavage of PB1 recognition sites by engineered endonucleases. (A) T-DNA structure before and after induction of the
endonuclease. Endonuclease cleavage excises the central fragment 5'-CTGCAG-3', eliminating the indicated PstI site. RB, right border; HSP,
Hsp18.2 promoter; Endo, PB1 or PB1+ endonuclease; T, Nopaline Synthase terminator; RS, endonuclease recognition site (RStaga or RSgtac); Kan,
kanamycin resistance marker; LB, left border. Horizontal arrows indicate approximate locations of PCR primers used for diagnostic evidence of in
planta endonuclease cleavage. (B) Table of experimental results. Seven different T-DNA constructs used in this study, with the general form
diagramed in (A). Each T-DNA has three possible differences: presence (Yes) or absence (No) of a nuclear localization signal (NLS) on the
endonuclease; the endonuclease with either the lower activity PB1 or higher activity PB1+ (containing E80Q mutation); PB1 recognition sites (RS)
contain either a TAGA or GTAC central 4 bp sequence. Plants containing some constructs (JJS20, 23, and 26) had a low recovery rate after heat
shock treatment, resulting in a lower number of plants screened. (C) Sample agarose gel data showing loss of the PstI restriction site from
genomic DNA following heat-shock treatment of JJS24 plants. The agarose gel shows three JJS24 samples that demonstrated loss of the PstI site.
Control (C) shows size of uncut PCR fragment. PCR fragments from samples before heat shock (-) are cut >90% into smaller bands (identified as
"cut" on left). After heat shock (+), PCR fragments from the three samples are largely uncut by PstI, indicating a loss of the PstI site in planta.

We produced at least 20 independent primary transgenic
plants (T1) for each distinct T-DNA. To test the
function of the two PB1 enzymes and RS in plants, we
induced expression of the endonucleases by subjecting
plants to a heat-shock treatment and harvested individual
leaves for analysis. Western blot analyses confirmed
that the endonuclease was not expressed at detectable
levels prior to heat shock, with expression strongly
induced by the two-hour heat shock (data not shown).
Genomic DNA was isolated from comparable leaves before
and after induction, then analyzed to determine
whether the PB1 endonucleases function in plants
(Figure 2B, 2C, and Additional file 1). As an initial test
for PB1 function in plants, we used PCR to amplify a
genomic fragment that encompasses the pair of RS and
tested for the presence or absence of the PstI site. If
both RS are cleaved by the engineered endonuclease, an
intervening fragment is excised, removing the PstI site.
Alternatively, cleavage of one site could produce a deletion
of the PstI site during non-homologous end joining
repair of the break. We scored our DNA as "intact" if
greater than 90% of the amplified DNA was digested
with PstI, or "cleaved" if a substantial amount of the leaf
DNA (represented by greater than 30% of the amplified
DNA) was resistant to PstI digestion, suggesting loss of
the internal fragment. We counted samples as
"cleaved" only if the unheated control sample showed
significant PstI digestion. In the few cases where the unheated
sample did not PCR amplify, a sample was counted
as "cleaved" only if greater than 80% of the amplified DNA
was not digested by PstI. In a few cases, both the heat-shocked
and non-heat-shocked samples were similarly resistant to PstI
digestion. These samples may have integrated the endonuclease
gene next to an endogenous promoter or enhancer
such that the endonuclease was expressed in the
absence of induction. These samples were not counted
as positive results.
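The scoring rules described above can be written as a small decision function. This is a minimal sketch, not the authors' procedure: the function name, the "not counted" and "indeterminate" labels, and the choice to treat "significant PstI digestion" of the unheated control as meeting the same >90% digested criterion are all assumptions made for illustration.

```python
from typing import Optional

# Thresholds quoted in the text (fractions of amplified DNA).
INTACT_DIGESTED = 0.90      # ">90% digested" means intact
CLEAVED_RESISTANT = 0.30    # ">30% resistant" means cleaved
NO_CONTROL_RESISTANT = 0.80  # ">80% resistant" needed when control failed


def score_sample(resistant_heated: float,
                 resistant_unheated: Optional[float]) -> str:
    """Score one plant from PstI-resistant fractions of the PCR product.

    resistant_unheated is None when the unheated sample did not amplify.
    """
    for value in (resistant_heated, resistant_unheated):
        if value is not None and not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")

    if 1.0 - resistant_heated > INTACT_DIGESTED:
        return "intact"

    if resistant_unheated is None:
        # No usable control: apply the stricter 80% rule.
        if resistant_heated > NO_CONTROL_RESISTANT:
            return "cleaved"
        return "not counted"

    # Assumption: the control "shows significant digestion" only if it
    # meets the intact criterion; otherwise expression may be leaky.
    if 1.0 - resistant_unheated <= INTACT_DIGESTED:
        return "not counted"

    if resistant_heated > CLEAVED_RESISTANT:
        return "cleaved"
    return "indeterminate"


print(score_sample(0.60, 0.03))
print(score_sample(0.60, None))
```

Samples falling between the two thresholds (10% to 30% resistant) are not assigned a category in the text, which is why the sketch returns a separate label for them.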

Genomic DNA samples isolated from all transgenic
plants before PB1 induction contain the intact PstI site
(Figure 2B), indicating that the recognition sites are intact
prior to endonuclease expression. Similarly, plant
lines (JJS20, JJS21, JJS22, JJS23) containing the four base-pair
center sequence (RStaga), which differs from that
found in the I-CreI crystal structure, also had intact PstI
sites even after induction of the PB1 or PB1+ endonucleases.
These results indicate that a differing four base-pair
center sequence, which decreased the efficiency of
the in vitro cleavage reaction, also hinders endonuclease
function in planta.

We then examined whether the designed PB1 endonuclease
cleaves plant DNA containing the four base-pair
center sequence (RSgtac) found in the crystal
structure described above. Three different lines (JJS24,
JJS25 and JJS26) were generated with this RS flanking
the PstI site. Plants were treated as described above and
genomic DNA analyzed before and after induction of
the endonucleases. Plant lines containing JJS26 express
the PB1 endonuclease with the naturally occurring E80
residue, and upon induction of the PB1 endonuclease,
the PstI site remains intact. In contrast, plant lines (JJS24 and
JJS25) containing the PB1+ endonuclease with the Q80
mutation lose the internal PstI site after endonuclease
induction (Figure 2B, 2C, and Additional file 1). These
results suggest an in planta requirement for the favorable
protein-DNA contact of glutamine (Q) at position
80, which improves the cleavage activity of PB1+. Similarly,
a need for an NLS on the engineered PB1 endonucleases
is also demonstrated: nineteen out of
thirty-six independent transgenic plants with the NLS
(JJS24) had PB1+ cleavage, compared to two out of
twenty-six independent transgenic plants without the
NLS (JJS25) (Figure 2B, 2C, and Additional file 1).
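To make the comparison concrete, the seven constructs and the two reported plant counts can be tabulated and the cleavage fractions computed. This is an illustrative sketch: the class and field names are hypothetical, and counts are filled in only for JJS24 and JJS25 because those are the only numbers given in the text.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Optional


@dataclass(frozen=True)
class Construct:
    name: str
    enzyme: str                     # "PB1" (E80) or "PB1+" (E80Q)
    nls: bool                       # SV40 nuclear localization signal
    center: str                     # 4 bp center of the recognition site
    cleaved: Optional[int] = None   # plants scored as cleaved, if reported
    screened: Optional[int] = None  # plants screened, if reported

    def cleavage_fraction(self) -> Optional[Fraction]:
        if self.cleaved is None or self.screened is None:
            return None
        return Fraction(self.cleaved, self.screened)


CONSTRUCTS = [
    Construct("JJS20", "PB1+", True, "TAGA"),
    Construct("JJS21", "PB1+", False, "TAGA"),
    Construct("JJS22", "PB1", True, "TAGA"),
    Construct("JJS23", "PB1", False, "TAGA"),
    Construct("JJS24", "PB1+", True, "GTAC", cleaved=19, screened=36),
    Construct("JJS25", "PB1+", False, "GTAC", cleaved=2, screened=26),
    Construct("JJS26", "PB1", True, "GTAC"),
]

by_name = {c.name: c for c in CONSTRUCTS}
with_nls = by_name["JJS24"].cleavage_fraction()
without_nls = by_name["JJS25"].cleavage_fraction()
fold_difference = with_nls / without_nls

print(f"JJS24 (NLS):    {float(with_nls):.1%}")
print(f"JJS25 (no NLS): {float(without_nls):.1%}")
print(f"fold difference: {float(fold_difference):.1f}")
```

On these counts, roughly half of the NLS-containing plants showed cleavage versus fewer than one in ten without the NLS.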

Genomic DNA from the PCR-amplified region both
before and after induction of the endonuclease was
cloned and the DNA sequence determined. All cloned
fragments from non-heat-shocked plants have genomic
DNA sequences that are indistinguishable from the
originally introduced T-DNA (data not shown). In contrast,
genomic DNA clones from the heat-shocked plants have
the PstI site deleted with frequencies ranging from 46%
to 63% in the case of JJS24, or 49% in the case of JJS25.
Unexpectedly, 100% (23 out of 23, representing eight
independent transgenic plants) of the clones that lacked
the PstI site had a very precise deletion of the DNA
sequence between the two RSgtac cut sites with
reconstitution of a new RSgtac cut site (as drawn in
Figure 2A), suggesting repair by simple re-ligation of the
two cut ends. From these data, we conclude that an
engineered PB1 homing endonuclease is capable of
cleaving an integrated recognition site in planta. However,
only the activity-optimized PB1+ enzyme yielded
detectable cleavage of the genomic DNA site, suggesting
a higher activity requirement in plants as opposed to
in vitro assays.
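The "perfect re-ligation" outcome can be illustrated on the top strand alone. The sketch below assumes that the top strand of each site is cut after the half-site plus center (13 bases into the 22 bp site), which is one reading of the 4-base 3' overhangs described earlier; the flanking and intervening sequences are made up for the example, and the function name is hypothetical.

```python
RS_GTAC = "CTCCGGGTCGTACGACCCGGAG"  # 22 bp PB1 site with GTAC center
TOP_STRAND_CUT = 13  # assumed: 9 bp half-site + 4 bp center


def excise_between_sites(top_strand: str, site: str = RS_GTAC) -> str:
    """Cut at two copies of the site and join the outer ends directly."""
    first = top_strand.find(site)
    if first == -1:
        raise ValueError("no recognition site found")
    second = top_strand.find(site, first + len(site))
    if second == -1:
        raise ValueError("second recognition site not found")
    left_end = top_strand[: first + TOP_STRAND_CUT]
    right_end = top_strand[second + TOP_STRAND_CUT:]
    return left_end + right_end


# Hypothetical locus: flank, site, PstI-containing spacer, site, flank.
locus = "AAAA" + RS_GTAC + "TTCTGCAGTT" + RS_GTAC + "CCCC"
repaired = excise_between_sites(locus)

print(repaired)
print(repaired.count(RS_GTAC))
```

Because both sites carry the same center sequence, joining the left end of the first site to the right end of the second restores exactly one intact site and removes the PstI sequence (CTGCAG) along with everything between the cuts.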

Engineered endonuclease excises a selectable marker in 
transgenic plants 

To determine if the length of DNA separating a pair of
PB1 recognition sequences affects the ability of the PB1
endonuclease to cleave both sites and remove the
intervening sequence, we modified the JJS24 T-DNA so
that the phosphinothricin acetyltransferase (BAR) gene,
encoding resistance to the Basta® herbicide (under control
of the Nopaline Synthase promoter), is inserted into
the PstI restriction site, producing JJS30 (Figure 3A).
This modified T-DNA was introduced into Arabidopsis,
and transgenic plants were selected for resistance to
kanamycin and Basta®. We analyzed twenty-two independent
T1 (primary transformant) plants for the presence
and absence of the BAR gene before and after
induction of the PB1+ endonuclease with heat shock (as
described above). Figure 3B shows that genomic DNA
isolated prior to heat-shock primarily yields a PCR product
approximately 1200 bp in length, consistent with the
original introduced T-DNA containing the BAR marker.
A second prominent genomic PCR product was found
in 16 of the 22 plants (first 12 shown in Figure 3B;
Additional file 2) after induction of PB1+ by heat-shock.
These PCR products are approximately 300 bp in length,
suggesting excision of the BAR marker in the plants. For
plants one, three, five and twelve, excision of the BAR
gene appears to be more efficient than for the others
(Figure 3B). Plants nineteen and twenty-one produced a
300 bp band in the absence of the heat shock. This uninduced
BAR removal may have resulted from elevated
"leaky" expression of the PB1+ endonuclease due to
integration of the endonuclease gene next to a strong
promoter or enhancer in the genome. Although the 300
bp band intensity appears to increase after heat shock,
these samples were not counted as positive results and
were not further analyzed.

To determine if the smaller PCR fragment truly represents
excision of the BAR gene, we cloned this product
from ten heat-shocked independent T1 plants and determined
their DNA sequence. DNA from these ten independent
T1 plants, representing a total of 49 sequenced
clones from individual bacterial colonies, confirmed removal
of the BAR gene from between the two RSgtac
sites (Figure 3C). Four independent T1 plants (five PCR
clones) that had excised the BAR gene did so in a manner
that precisely reconstituted the RSgtac site, again consistent
with cleavage of the T-DNA followed by simple
re-ligation (Figure 3C, and Additional file 3). The remaining
plants and clones had small deletions 3-47 base pairs in
length. It is also possible that there are other deletions that
our cloning methodology may miss, for example, larger
deletions that extend beyond the priming sites used for
our PCR-based analyses, or DNA breaks at non-intended
sites, as was recently observed in human cells that had
undergone gene therapy with engineered ZFNs [42].

Three T1 plants from the BAR removal experiment
that showed clean excision by our PCR assay were
allowed to self-fertilize, and progeny that contained the
T-DNA were selected by germinating seed on medium
with kanamycin. To determine if excision of the BAR gene
is a genomic change that is inherited in the T2 progeny,
we "painted" leaves from each plant with the Basta®
herbicide. Nineteen of these T2 plants, representing all
three T1 plants, were identified as kanamycin resistant,
Basta® sensitive. We excised one leaf from each plant and
used PCR to confirm that they contain the JJS30 T-DNA
but lack the BAR gene. Three of the nineteen plants
completely lacked a BAR gene (Additional file 4). The
remaining sixteen plants contained some portion of cells
with an intact BAR gene that was either silenced or
incorrectly identified as Basta® sensitive. These chimeric
plants were not analyzed further. The PCR products
obtained from the three T2 plants lacking the BAR gene
were cloned, and eight clones resulting from each PCR
product were sequenced. In clones obtained from one of
these three plants, the DNA sequence is consistent with
another T-DNA integration or a rearrangement during
integration that mutated the BAR gene. This plant was
likewise not analyzed further. In DNA from two of the
three T2 plants, all eight clones from the same plant
contained the same DNA sequence lacking the BAR
gene, distinct from the mixed sequences in leaves of
induced primary transformants (T1 plants). However,
further attempts to find T3 plants containing only the
BAR-lacking T-DNA were unsuccessful (data not shown),
indicating that excision of the BAR gene does not occur
in stem cells or is an extremely rare occurrence. Also of
note is that one of the two observed T2 plants contained
a reconstituted RSgtac site.

Figure 3 Induction of PB1+ endonuclease removes BAR gene from Arabidopsis plants. (A) Schematic of the JJS30 T-DNA before and after
induction of the PB1+ endonuclease. Two RSgtac sites flank the BAR gene, so that induction of the endonuclease excises the herbicide resistance
gene from the genome. The heat-inducible promoter Hsp18.2 controls expression of PB1+. Arrows indicate location of PCR primers used to assay
for BAR excision. (B) PCR analysis of JJS30 primary transformants before and after heat-shock, using primers shown in (A). Unmodified JJS30
T-DNA yields a PCR product approximately 1200 bp in length (BAR+), whereas JJS30 lacking the BAR gene is approximately 300 bp (BAR-). (C) DNA
sequence of repair junctions from BAR- clones. The approximately 300 bp PCR products from (B) were cloned and sequenced to evaluate the
DNA repair junctions. Forty-six clones were evaluated that represented ten plants yielding a significant amount of BAR minus (-) PCR product
(excluding plants 2 and 11). Ten unique sequences were obtained and these are aligned with the "perfect re-ligation" product (sequence 1), in
which the reconstituted PB1 recognition site is shaded and the location of phosphodiester bond cleavage/re-ligation is indicated by the
arrowhead. Total number of independent clones that yielded each sequence is indicated, as well as the number of individual transformed plants
that yielded those clones. Bases that are conserved between the two halves of the repair junction (microhomology) are underlined. Single and
double base insertions at the repair junction are shown in lower case (sequence 8 and 4, respectively).

Discussion 

Re-design of endonucleases is a powerful approach towards
precise modification of plant and mammalian
genomes. Seligman et al. [41] previously changed the
I-CreI endonuclease at position C33, producing altered
DNA recognition. We engineered seven changes in
I-CreI to produce the PB1+ endonuclease and show that
this engineered homing endonuclease is capable of targeting
an introduced site within the plant genome. We
report that the in planta cleavage of a pair of juxtaposed
PB1 endonuclease recognition sites, as in the JJS24 and
JJS25 constructs, results in the precise excision of the
intervening DNA sequence with the reconstitution of a
functional recognition site. These results are somewhat
contrary to the widely-held notion that NHEJ,
the dominant form of DNA repair in plants, is generally
mutagenic [43]. This type of "perfect re-ligation" is
not entirely without precedent, however. For example,
Siebert and Puchta observed analogous excision and
re-ligation using a pair of I-SceI endonuclease sites in
transgenic tobacco [16]. The frequency of perfect re-ligation
in these experiments was low, however, relative to the
frequency of mutagenic repair [15]. Because DSB repair
in plants is thought to occur primarily through a
single-strand annealing (SSA) mechanism that requires short
regions of homology between DNA ends at the repair
junction, one possibility is that the observed perfect
re-ligation was due to cleavage of one of the two recognition
sites with subsequent repair by SSA (or an SSA-like
mechanism) at the second site. Another possible repair
mechanism may have involved cleavage at both recognition
sites and subsequent re-ligation of the two "sticky"
ends after loss of the intervening DNA. Our current
results cannot distinguish between these two possible
repair mechanisms or eliminate the possibility that some
PstI minus samples were produced without a need for
the PB1 endonuclease. Nevertheless, comparison of
heat-shocked and non-heat-shocked samples clearly demonstrates
that the PB1+ endonuclease stimulates the loss of the
PstI site. Obtaining a single repair junction from multiple
independent plants is noteworthy, especially considering
that, due to the experimental setup, each plant
cell within the leaf constitutes an independent cleavage
event that could have resulted in a different repair
junction outcome.

Our results with the removal of the BAR gene
(Figure 3) are more consistent with current models of
DNA repair in plants (reviewed in [43]; [44]). In this
case, positioning the two PB1 recognition sites approximately
1 kb apart resulted in a much lower frequency
of perfect re-ligation. Ninety percent of the clones
sequenced from ten independent JJS30 plants exhibited
additional DNA deletion from the region flanking the
PB1 recognition sites, and the observed deletions are
decidedly non-random. Only nine unique deletions were
detected in 48 sequenced clones (Figure 3G). In particular,
sequences 5, 6, 7, and 9 were obtained multiple
times from multiple independent plants (Additional file 3).
Because the endonuclease was activated in mature plants,
each cell constitutes an independent cleavage and repair
event. As expected, the BAR removal results were
chimeric but, similarly to the PstI removal results, it is
interesting that the same repair junctions were found
repeatedly. In each case there is a 3-5 bp "microhomology"
at the junction, suggesting an SSA-like mechanism of
repair (microhomologies are underlined in Figure 3G).
The existence of short patches of homology at DNA
repair junctions is a characteristic feature of DNA repair
by SSA in plants [17,45,46] and other eukaryotes [47,48].
The number of possible repair junctions may be limited
by the preference for these microhomologies.
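
As an illustration of how junction microhomologies of this kind can be identified from sequence data, the following Python sketch reports the bases shared by the two ends of a deletion. It is not the analysis used in this work: the function name, the 0-based coordinate convention and the example sequence are assumptions made for illustration only.

```python
def junction_microhomology(reference: str, del_start: int, del_end: int) -> str:
    """Return the microhomology at a deletion junction.

    The deletion removes reference[del_start:del_end] (0-based, end-exclusive;
    an assumed convention). A microhomology is a short run of bases that is
    present at both edges of the deletion, so that the breakpoint could be
    placed anywhere within it without changing the repaired sequence.
    """
    if not 0 <= del_start < del_end <= len(reference):
        raise ValueError("deletion coordinates must lie within the reference")
    ref = reference.upper()
    deleted_len = del_end - del_start

    # Bases just upstream of the deletion that match the end of the deleted segment.
    left = 0
    while (left < deleted_len
           and del_start - left - 1 >= 0
           and ref[del_start - left - 1] == ref[del_end - left - 1]):
        left += 1

    # Bases at the start of the deleted segment that match the retained downstream bases.
    right = 0
    while (right < deleted_len
           and del_end + right < len(ref)
           and ref[del_start + right] == ref[del_end + right]):
        right += 1

    return ref[del_start - left:del_start + right]


# Hypothetical example: "ACG" occurs on both sides of the removed segment.
EXAMPLE_REF = "GGATCC" + "ACG" + "TTTTTT" + "ACG" + "CATGCA"
print(junction_microhomology(EXAMPLE_REF, 9, 18))  # ACG
```

Both possible descriptions of the same deletion (removing positions 9-18 or 6-15 of the example) return the same 3 bp microhomology, which is why the sketch extends the comparison in both directions from the reported breakpoint.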

Another significant finding is the comparison between
endonuclease activity determined in vitro and the activity
observed in planta. For example, we observed significant
in vitro DNA cleavage activity by the PB1 endonuclease
(Figure 1B), yet only the more active PB1+ endonuclease
had detectable function in plants. Likewise, although the
RStaga sequence could be cleaved in vitro, only the preferred
RSgtac sequence appears to be a suitable cleavage
substrate in planta. One possibility is that there is an
"activity threshold" that an endonuclease must achieve
before it is able to function in vivo and that this threshold
is higher than what is required for in vitro cleavage
of plasmid DNA. Interestingly, a single amino acid substitution
accounts for the difference between PB1 lying
below the threshold and PB1+ lying above, indicating
that very minor changes can determine success or failure
in vivo. When this threshold of activity is achieved, however,
as is the case for the PB1+ endonuclease paired
with the RSgtac recognition sequence, in planta cleavage
of the recognition sequence is remarkably efficient.
This "all or nothing" feature of our in planta cleavage
results suggests that the observed differences in cleavage
efficiency are not merely due to reduced endonuclease
expression levels in plants. Rather, there appear to be
intrinsic differences between in vitro and in planta
endonuclease function that could be due to differences in
environment (e.g., pH or solute concentrations) or, more
likely, due to differences between plasmid and genomic
DNA as a cleavage substrate. The chromatin structure of
plant genomic DNA is a likely factor restricting accessibility
of the endonuclease to DNA, thereby reducing
its efficiency in vivo. Several studies suggest that altering
chromatin in planta aids HR and gene targeting
[10,49,50]. In our work, the heat-shock treatment used to
induce the PB1+ endonuclease is also known to alter
chromatin, and may make the recognition site more
accessible to the endonuclease. It is also possible that this
"activity threshold" is not unique to the PB1 endonucleases
and is a more general characteristic of I-CreI and
engineered homing endonucleases derived from it.

Though we have undertaken great effort to replicate
the in planta experiments reported here using wild-type
I-CreI, we have been unable to obtain Arabidopsis transformants
with the wild-type endonuclease gene, perhaps
due to leaky expression of the endonuclease resulting in
toxicity. Wild-type I-CreI is known to be highly promiscuous
in its cleavage site selection and toxic to a wide
range of cell types [41,51-53], and the toxicity mechanism
of wild-type I-CreI may parallel the toxicity mechanism
of engineered ZFNs [54]. In contrast to the wild-type
I-CreI, we observed no evidence of toxicity due to expression
of the PB1 or PB1+ endonucleases. All plants
are phenotypically normal, and healthy third-generation
plants containing the endonuclease-modified JJS24 and
JJS30 transgenes have been produced. Recently, we
demonstrated that another engineered endonuclease successfully
targets an endogenous locus in maize, generating
heritable deletions at the endogenous target site [34].
However, in the present work we were unable to find T3
or T4 generation Arabidopsis plants in which all the cells
contained only the BAR- T-DNA (data not shown), suggesting
that meganuclease activity, or activity of the
heat-inducible promoter controlling the meganuclease, in stem
cells is either absent or extremely rare. T3 and T4 generation
plants are chimeric for the deletions, possibly as a
result of spurious activation of the heat-shock inducible
promoter by some factor, such as stress, during plant
growth and development. Basal levels of transcription
from the heat-shock inducible promoter used in the
present work (HSP18.2) have been reported in the literature
[55], and may explain the chimeric plants obtained.

While the modification of endogenous genomic loci is
one application for which this technology is being developed,
the PB1+ endonuclease is a valuable tool for plant
biotechnology. Excising a selectable marker, such as the
herbicide resistance gene demonstrated here, can provide advanced
crops and plant systems without objectionable DNA. The
significance of our achievement is demonstrated by the
numerous previous efforts towards this end. For example,
previous reports have described the development of
site-specific recombinases for marker-gene excision (for
review, see [56-59]). Zinc finger nucleases have also recently
been shown to remove an intervening transgene by flanking
the transgene with recognition sites [7]. It is difficult
to make any comparisons with this work, however, because
multiple tandem recognition sites were used on both sides
of the transgene. In addition, pioneering work by Puchta
and coworkers has demonstrated that the I-SceI homing
endonuclease can be used to excise a selectable marker
gene integrated between a pair of I-SceI recognition sites
in transgenic tobacco at frequencies ranging from 19 to
75% [16]. By flanking the recognition sites with a short
stretch of duplicated DNA sequence, it was possible for
these authors to obtain plants in which the I-SceI-induced
DNA breaks were repaired through recombination between
the repeated sequences. The outcome of these
events was the removal of both the selectable marker and
the I-SceI recognition sites from the genome. Marker gene
excision using a recombinase, in contrast, necessarily
leaves the recognition site(s) behind in the genome. We
demonstrated that the PB1+ endonuclease is capable of
catalyzing the efficient removal of a selectable marker
from Arabidopsis plants in a manner analogous to I-SceI.
Because it is possible to engineer a large number of I-CreI
variants that recognize widely divergent DNA sequences,
it should be possible to independently manipulate multiple
T-DNAs and transgenes in the same plant by flanking
the T-DNAs with different endonuclease recognition
sites. In this study, the recognition sites for the endonuclease
were introduced in order to simplify the experiments,
by producing a pair of identical recognition sites flanking
an easily monitored marker (PstI site or BAR gene). Using
the criteria learned from these experiments, however, it
may also be possible to modify already integrated or endogenous
sequences by custom engineering an endonuclease
to recognize target sites within these sequences.
For example, a custom meganuclease was engineered to
target an endogenous sequence in maize [34]. The design
process for a custom homing endonuclease is still more
complex than designing a TAL or zinc finger nuclease, but
numerous groups are working to routinely generate custom
meganucleases as a viable third option for genome
engineering. Our system provides a clear alternative to
TAL and zinc finger nucleases. Yet, given the effectiveness
and ease of use of the TAL system, re-engineered homing
endonucleases may have niche-specific applications.

Conclusions 

The results reported here constitute a significant step toward
the use of engineered homing endonucleases for
the modification of endogenous loci in plant genomes.
Such alterations, including removal of selectable markers,
targeted integration of transgenes, and modification of
endogenous genes, may go far to reduce public objections to
genetically modified plants, enhancing biotechnology's
ability to provide sustainable food and fuel resources.

Methods 

Plant material, transgenic plant production and growth 
conditions 

Arabidopsis thaliana (ecotype Col-0) was used for
transformation. Plasmids were assembled as described
below and transferred into Agrobacterium tumefaciens
strain GV3101 by electroporation. Arabidopsis plants
were transformed by the floral dip method [60]. Primary
transgenic T1 plants were selected on culture medium
containing full-strength MS media [61], 0.8% agar, pH
5.7. Kanamycin (50 mg/L) (Sigma-Aldrich, St. Louis,
MO), and/or glufosinate (5 mg/L) (Basta®; Crescent
Chemical, Islandia, NY) were added to the medium as
needed for the selection required for the transgenic plants.
T1 lines were selected and allowed to self-pollinate. Single
T-DNA insertion lines were identified by segregation of
the kanamycin resistance gene in the T2 generation.
Transgenic seeds were sterilized, cold-treated for 1-3 days
at 4°C to synchronize germination, and
grown at 23-25°C under a 16 hours light (70-100 µE·m⁻²·s⁻¹
fluorescent light)/8 hours dark cycle, in either a Percival
AR75L growth chamber or light shelf.
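
The segregation criterion for single T-DNA insertion lines can be made concrete with a goodness-of-fit test. The sketch below assumes the conventional expectation that a single dominant insertion segregates 3:1 (resistant:sensitive) among T2 seedlings and uses a chi-square test with one degree of freedom. The text does not state which test or threshold was applied, so the function names, the 0.05 significance level and the seedling counts are illustrative assumptions.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRITICAL_DF1_P05 = 3.841


def chi_square_3_to_1(resistant: int, sensitive: int) -> float:
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = resistant + sensitive
    if total <= 0:
        raise ValueError("at least one seedling must be scored")
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def consistent_with_single_insertion(resistant: int, sensitive: int) -> bool:
    """True if the counts do not deviate significantly from 3:1 (assumed cutoff)."""
    return chi_square_3_to_1(resistant, sensitive) < CHI2_CRITICAL_DF1_P05


# Invented counts, for illustration only.
print(consistent_with_single_insertion(75, 25))  # True
print(consistent_with_single_insertion(90, 10))  # False
```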

Synthesis of the PB1 and PB1+ vectors 

The PB1 endonuclease was produced using the oligonucleotide
overlap extension method [62] of PCR to
introduce mutations into a codon-optimized version of
the I-CreI monomer. To produce PB1, we introduced
eight amino acid changes: Q26S, K28R, N30R, Y33C,
Q38E, S40E, T42R, and I77R. PB1+ was produced
by introducing the additional mutation E80Q to PB1.
As detailed in the table of Figure 2, some plant T-DNA
constructs included an SV40 nuclear localization signal
(sequence MAPKKKRKVI) at the N-terminus of the
endonuclease. Plant T-DNA constructs were assembled
in the pCAMBIA2300 vector. An enhanced CaMV35S promoter
with omega enhancer [63] and a Nos terminator
were PCR amplified and subsequently fused to the
endonuclease gene by overlapping oligonucleotide extension
PCR. The full expression cassette was inserted between
the HindIII and BamHI sites of pCAMBIA2300. The
pair of recognition sites with the intervening PstI site
was synthesized as oligonucleotides, phosphorylated
with T4 polynucleotide kinase, annealed, and ligated between
the BamHI and KpnI sites of pCAMBIA2300.
The BAR expression cassette was PCR amplified from
pCB302-3 [64] and inserted into the PstI site of the
JJS24 construct.
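
The substitutions listed above follow the usual "wild-type residue, position, new residue" notation and can be applied to a protein sequence programmatically. A minimal sketch, assuming 1-based residue numbering: the template built here is a placeholder with alanine at every unlisted position, not the I-CreI sequence, and the helper names are hypothetical.

```python
import re

# Substitutions as listed in the text.
PB1_MUTATIONS = ["Q26S", "K28R", "N30R", "Y33C", "Q38E", "S40E", "T42R", "I77R"]
PB1_PLUS_MUTATIONS = PB1_MUTATIONS + ["E80Q"]


def parse_mutation(mutation: str) -> tuple[str, int, str]:
    """Split e.g. 'Q26S' into ('Q', 26, 'S')."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutations(protein: str, mutations: list[str]) -> str:
    """Apply point substitutions, checking the expected wild-type residue first."""
    residues = list(protein)
    for mutation in mutations:
        wild_type, position, new = parse_mutation(mutation)
        if residues[position - 1] != wild_type:  # 1-based numbering (assumed)
            raise ValueError(f"{mutation}: found {residues[position - 1]} at {position}")
        residues[position - 1] = new
    return "".join(residues)


def placeholder_template(mutations: list[str], length: int = 100) -> str:
    """Build a dummy poly-alanine sequence carrying the wild-type residues.

    This is NOT the I-CreI sequence; it only makes the example runnable.
    """
    residues = ["A"] * length
    for mutation in mutations:
        wild_type, position, _ = parse_mutation(mutation)
        residues[position - 1] = wild_type
    return "".join(residues)


template = placeholder_template(PB1_PLUS_MUTATIONS)
pb1_plus_like = apply_mutations(template, PB1_PLUS_MUTATIONS)
print(sum(a != b for a, b in zip(template, pb1_plus_like)))  # 9
```

Checking the wild-type residue before each substitution catches numbering offsets, for example when a construct carries an N-terminal tag or nuclear localization signal.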



Protein purification and in vitro endonuclease assay 

The coding sequences for PB1, PB1+, and wild-type I-CreI
were subcloned into a bacterial expression vector
(pET-21a, Novagen). All three genes carried a C-terminal
six-histidine tag to facilitate purification. The histidine
tag was omitted from constructs expressed in plants.
BL21 (DE3) cells were transformed with each plasmid
and cultured on standard 2x YT medium containing
200 µg/mL ampicillin.

Protein expression was induced by addition of 1 mM
IPTG after reducing the growth temperature from 37 to
22°C. Three hours after induction, the cells were pelleted
by centrifugation for 10 min at 6,000 x g, and the
pellets were resuspended in 1 mL binding buffer (20 mM
Tris/HCl, pH 8.0, 500 mM NaCl, 10 mM imidazole) by
vortexing. The cells were disrupted using 12 pulses of
sonication (50% power), and the cell debris was pelleted
by centrifugation for 15 min at 14,000 x g. The cell
supernatant was diluted in 4 mL binding buffer and
loaded onto a 200 µL nickel-charged metal-chelating
Sepharose column. The column was washed with 4 mL
wash buffer (20 mM Tris/HCl, pH 8.0, 500 mM NaCl,
60 mM imidazole) and then 0.2 mL elution buffer
(20 mM Tris/HCl, pH 8.0, 500 mM NaCl, 400 mM
imidazole). The enzymes were eluted in 0.6 mL elution
buffer and concentrated to 50-130 µL using Vivaspin
disposable concentrators (ISC BioExpress). The enzymes
were exchanged into SA buffer (25 mM Tris/HCl, pH
8.0, 100 mM NaCl, 5 mM MgCl2, 5 mM EDTA) for
assays and storage using Zeba spin desalting columns
(Thermo Scientific). The purity and molecular weight of
the enzymes were then confirmed by MALDI-TOF mass
spectrometry. For in vitro cleavage assays, 25 pmol of a
pUC19 plasmid harboring the meganuclease recognition
sequence was linearized using XmnI, then incubated with
the indicated concentration of purified meganuclease for
1 h at 37°C in 10 mM Tris, pH 8.0, 50 mM NaCl, 10 mM
MgCl2. Reactions were stopped by addition of 0.5% SDS,
25 mM EDTA and 10 µL Proteinase K (New England
BioLabs). After an additional 1 h incubation at 37°C, plasmid
digestions were separated by gel electrophoresis, and
the cut and uncut DNA bands were quantified using the
ImageJ software (http://rsbweb.nih.gov/ij).
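
Quantifying the cut and uncut bands reduces to a simple ratio. A minimal sketch, assuming that background-subtracted band intensities are proportional to DNA mass and that percent cleavage is defined as the cut fraction of the total signal; the exact formula is not given in the text, and the function name and intensity values are hypothetical.

```python
from typing import Sequence


def percent_cleaved(uncut: float, cut_fragments: Sequence[float]) -> float:
    """Percent of substrate cleaved, from gel band intensities.

    uncut         -- intensity of the full-length (uncut) band
    cut_fragments -- intensities of the product bands from the same lane
    """
    cut = sum(cut_fragments)
    total = uncut + cut
    if total <= 0:
        raise ValueError("lane contains no signal")
    return 100.0 * cut / total


# Invented intensities for one lane: one uncut band and two product bands.
print(percent_cleaved(25.0, [50.0, 25.0]))  # 75.0
```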

Induction of expression of PB1 and PB1+ in plants 

Transgenic T1 plants were selected in MS media supplemented
with the appropriate selection agents as
described above, and expression of the PB1 and PB1+
endonucleases was induced by heat-shock when plants
were three weeks old. Heat-shock treatment consisted of
submerging Parafilm-sealed plates containing plants in a
water bath at 40°C for two hours, according to [50]. For
genomic DNA extraction and subsequent PCR analysis,
one leaf was removed prior to the heat-shock treatment
and quickly frozen in liquid N2 (- heat-shock sample),
and another leaf was removed after plants were allowed
to recover from the heat-shock treatment for 24 hours
(+ heat-shock sample).

PCR and sequence analysis of recombination events

Genomic DNA was isolated from Arabidopsis leaves using
the Extract-N-Amp kit (Sigma-Aldrich) according to the
manufacturer's instructions. The region of DNA encompassing
the PB1 recognition sites was PCR amplified using
the primers: 5'-GGTGTAGGGAATAGGAAAGG-3' and
5'-GTGTAGAGAAATGTTGTGGGAGGTG-3'. For the
initial set of experiments screening for the loss of a PstI
restriction site situated between the PB1 recognition
sites, the PCR amplified fragments were digested overnight
at 37°C with 20 U PstI (New England BioLabs)
in 1x NEB3 buffer. The digested products were resolved
on a 2% agarose gel and visualized with ethidium
bromide on a UV light source. For the BAR expression
cassette removal experiment, the same region of the
T-DNA was PCR amplified but the PCR products were
directly resolved on a 1.5% agarose gel. PCR fragments
corresponding to the loss of the BAR expression cassette
(~300 bp) were excised from the gel and purified
using the QIAquick gel extraction kit (Qiagen). The purified
PCR fragments were blunt-end cloned into the SmaI
site of the pUC19 vector. Colonies containing inserts in the
vector were identified by blue-white screening. Plasmid
DNA was isolated using Qiagen DNA mini-prep kits
and sequenced using the M13R primer
(5'-CAGGAAACAGCTATGACC-3').
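
The PstI screen can also be simulated in silico. The sketch below predicts the fragment lengths expected from a PstI digest of an amplicon, using the PstI recognition sequence CTGCAG with cleavage after the A on the top strand. The function name and the toy amplicon are illustrative assumptions and do not correspond to the actual T-DNA sequence.

```python
PSTI_SITE = "CTGCAG"   # PstI recognition sequence (palindromic)
PSTI_CUT_OFFSET = 5    # top-strand cut: CTGCA^G


def psti_fragments(amplicon: str) -> list[int]:
    """Top-strand fragment lengths after complete PstI digestion."""
    seq = amplicon.upper()
    cut_positions = []
    start = seq.find(PSTI_SITE)
    while start != -1:
        cut_positions.append(start + PSTI_CUT_OFFSET)
        start = seq.find(PSTI_SITE, start + 1)
    bounds = [0] + cut_positions + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Toy amplicons: one retaining the PstI site, one that has lost it.
with_site = "A" * 10 + "CTGCAG" + "T" * 20
without_site = "A" * 30
print(psti_fragments(with_site))     # [15, 21]
print(psti_fragments(without_site))  # [30]
```

An amplicon that yields a single fragment equal to its full length is resistant to PstI, which is the signature scored after heat-shock in the gel-based screen.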



Additional files 



Additional file 1: Figure S1. In planta cleavage of PB1 recognition sites
by engineered endonucleases following heat-shock, resulting in loss of
PstI site. Agarose gel shows a PstI screen of the remaining thirty-two
JJS24 samples before and after heat shock. PCR fragments from samples
before heat shock (-) are cut >90% into product bands (identified as
"PstI cut PCR" on right side of gel). After heat shock (+), the PCR
fragments from the three samples are largely uncut by PstI, indicating a
loss of the PstI site in planta. Plant samples that demonstrated a
significant resistance to cleavage by PstI after heat-shock are indicated
with a "*". Sequence analysis of these cloned PCR fragments (*)
confirmed the loss of the PstI site and reconstitution of a single PB1
recognition site.

Additional file 2: Figure S2. Induction of the PB1+ endonuclease
removes the BAR gene from Arabidopsis plants. The two gels show the
PCR analysis of all twenty-four JJS30 transformants. Genomic DNA
samples were taken from twenty-four JJS30 transformants (first twelve
represented in Figure 3B) before and after heat-shock, and evaluated by
PCR using the primers shown in Figure 3A. The unmodified JJS30 T-DNA
is expected to yield a PCR product approximately 1200 bp in length (BAR
+ arrow), whereas JJS30 lacking the BAR gene is expected to be
approximately 300 bp (BAR- arrow).

Additional file 3: Table S1. DNA sequences of individual clones
containing PCR-amplified repair junctions from ten different plants
following BAR expression cassette removal.

Additional file 4: Figure S3. Analysis of BAR removal in T2 generation
arising from heat-shocked JJS30 T1 Arabidopsis plants. Following heat-shock
and recovery, T1 (primary transformants) Arabidopsis plants were
allowed to self-pollinate. The resulting progeny were grown on medium
with kanamycin to select for the JJS30 T-DNA and screened for Basta®
resistance by painting a leaf with Basta®. Genomic DNA was extracted
from plants that appeared to be Basta® sensitive and the region
encompassing the BAR expression cassette was amplified by PCR. PCR
fragments were resolved on a 1.5% agarose gel looking for
homogeneous BAR minus T-DNA. Samples 8, 15, and 19 appear to lack a
copy of the BAR cassette. Samples 6, 7, 14, and 16 appear to have an
equal mixture of T-DNAs with and without the BAR cassette. These
samples may contain two T-DNAs or may have resulted from BAR
removal in the T2 generation by leaky expression of the PB1+
endonuclease. Finally, samples 1, 2, 4, 5, 9, 10, 11, 12, 13, 17, and 18
appear to only contain an intact BAR cassette. These plants may have
been incorrectly identified as sensitive with our Basta® painting screen,
and/or they may have silenced expression of the BAR gene. The PCR
fragments from samples 8, 15, and 19 were cloned and eight individual
clones for each sample were sequenced to determine if they are truly
homogeneous. In each case, all eight clones had the same sequence,
indicating that the plants are not chimeric, unlike their parental T1 plants.
Sample 8 had a small insertion and deletion at the repair junction.
Sample 15 had a conservative repair junction with a reconstituted
recognition site. Sample 19 appears to be a recombination event with
another T-DNA.